Learning and Transfer of Modulated Locomotor Controllers
We study a novel architecture and training procedure for locomotion tasks. A
high-frequency, low-level "spinal" network with access to proprioceptive
sensors learns sensorimotor primitives by training on simple tasks. This
pre-trained module is fixed and connected to a low-frequency, high-level
"cortical" network, with access to all sensors, which drives behavior by
modulating the inputs to the spinal network. Where a monolithic end-to-end
architecture fails completely, learning with a pre-trained spinal module
succeeds at multiple high-level tasks, and enables the effective exploration
required to learn from sparse rewards. We test our proposed architecture on
three simulated bodies: a 16-dimensional swimming snake, a 20-dimensional
quadruped, and a 54-dimensional humanoid. Our results are illustrated in the
accompanying video at https://youtu.be/sboPYvhpraQ.
Comment: Supplemental video available at https://youtu.be/sboPYvhpra
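
The two-rate structure described in this abstract can be sketched as a control loop in which the low-level policy acts at every step while the high-level policy refreshes its modulation signal only every k steps. This is a minimal illustration, not the authors' implementation: the class names, the random linear-tanh policies, the dimensions, and the random stand-in sensor readings are all assumptions made for the example.

```python
import numpy as np


class SpinalPolicy:
    """Hypothetical low-level policy: sees proprioception plus a modulation input."""

    def __init__(self, proprio_dim, mod_dim, act_dim, rng):
        # Random weights stand in for a pre-trained, frozen network.
        self.W = rng.normal(scale=0.1, size=(act_dim, proprio_dim + mod_dim))

    def act(self, proprio, modulation):
        return np.tanh(self.W @ np.concatenate([proprio, modulation]))


class CorticalPolicy:
    """Hypothetical high-level policy: sees all sensors, outputs a modulation signal."""

    def __init__(self, obs_dim, mod_dim, rng):
        self.W = rng.normal(scale=0.1, size=(mod_dim, obs_dim))

    def modulate(self, full_obs):
        return np.tanh(self.W @ full_obs)


def run_episode(steps=20, k=5, seed=0):
    """Run the two-rate loop; returns the actions and the number of high-level updates."""
    rng = np.random.default_rng(seed)
    proprio_dim, extero_dim, mod_dim, act_dim = 4, 3, 2, 2  # assumed sizes
    spinal = SpinalPolicy(proprio_dim, mod_dim, act_dim, rng)
    cortical = CorticalPolicy(proprio_dim + extero_dim, mod_dim, rng)

    modulation = np.zeros(mod_dim)
    updates = 0
    actions = []
    for t in range(steps):
        # Random vectors stand in for sensor readings from a simulator.
        proprio = rng.normal(size=proprio_dim)
        extero = rng.normal(size=extero_dim)
        if t % k == 0:
            # Low-frequency update: the high-level policy sees every sensor.
            modulation = cortical.modulate(np.concatenate([proprio, extero]))
            updates += 1
        # High-frequency update: the low-level policy sees proprioception only.
        actions.append(spinal.act(proprio, modulation))
    return np.array(actions), updates
```

With steps=20 and k=5, the high-level policy is queried four times while the low-level policy produces twenty actions; between high-level updates the modulation signal is simply held constant.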
Stochastic Complementarity for Local Control of Discontinuous Dynamics
We present a method for smoothing discontinuous dynamics involving contact and friction, thereby facilitating the use of local optimization techniques for control. The method replaces the standard Linear Complementarity Problem with a Stochastic Linear Complementarity Problem. The resulting dynamics are continuously differentiable, and the resulting controllers are robust to disturbances. We demonstrate our method on a simulated 6-dimensional manipulation task, which involves a finger learning to spin an anchored object by repeated flicking.
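
To see why introducing noise into a complementarity problem can yield differentiable behaviour, consider a one-dimensional toy case. This is an illustration only and not the paper's formulation or solver: the scalar problem, the Gaussian noise model, and the function names are assumptions chosen for the example. The scalar problem "x >= 0, x + q >= 0, x (x + q) = 0" has the solution x = max(0, -q), which has a kink at q = 0; averaging that solution over q ~ N(mu, sigma^2) gives a function of mu that is smooth everywhere.

```python
import math


def lcp_1d(q):
    """Solve x >= 0, x + q >= 0, x * (x + q) = 0; the solution is max(0, -q)."""
    return max(0.0, -q)


def _std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_lcp_1d(mu, sigma):
    """Closed form of E[max(0, -q)] for q ~ N(mu, sigma^2)."""
    m = -mu  # mean of -q
    z = m / sigma
    return m * _std_normal_cdf(z) + sigma * _std_normal_pdf(z)


def expected_lcp_1d_grad(mu, sigma):
    """Derivative of the expected solution with respect to mu: -Phi(-mu / sigma)."""
    return -_std_normal_cdf(-mu / sigma)
```

The deterministic solution has slope -1 for q < 0 and slope 0 for q > 0, with no derivative at q = 0. The expected solution has a well-defined slope of -0.5 at mu = 0, and as sigma shrinks it approaches the deterministic solution.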
Predictive Sampling: Real-time Behaviour Synthesis with MuJoCo
We introduce MuJoCo MPC (MJPC), an open-source, interactive application and
software framework for real-time predictive control, based on MuJoCo physics.
MJPC allows the user to easily author and solve complex robotics tasks, and
currently supports three shooting-based planners: derivative-based iLQG and
Gradient Descent, and a simple derivative-free method we call Predictive
Sampling. Predictive Sampling was designed as an elementary baseline, mostly
for its pedagogical value, but turned out to be surprisingly competitive with
the more established algorithms. This work does not present algorithmic
advances, and instead, prioritises performant algorithms, simple code, and
accessibility of model-based methods via intuitive and interactive software.
MJPC is available at github.com/deepmind/mujoco_mpc; a video summary can be
viewed at dpmd.ai/mjpc.
Comment: Minor fixes and formattin
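
The abstract describes Predictive Sampling only as a simple derivative-free shooting method. A generic sampling-based shooting loop in that spirit can be sketched as follows: perturb a nominal control sequence with Gaussian noise, roll out every candidate, and keep the one with the lowest cost. The sketch does not use MuJoCo or any MJPC API; the 1-D double-integrator dynamics, the cost function, and all function names and parameter values are assumptions made for the example.

```python
import numpy as np


def rollout_cost(x0, controls, dt=0.1, goal=1.0):
    """Simulate a toy 1-D double integrator and sum a quadratic tracking cost."""
    pos, vel = x0
    cost = 0.0
    for u in controls:
        vel = vel + dt * u
        pos = pos + dt * vel
        cost += (pos - goal) ** 2 + 1e-3 * u ** 2
    return cost


def sampling_step(x0, nominal, rng, num_samples=32, noise_std=0.5):
    """Evaluate the nominal plan plus noisy copies; return the best plan and its cost."""
    candidates = [nominal]  # keeping the nominal means the cost can never get worse
    for _ in range(num_samples - 1):
        noise = rng.normal(scale=noise_std, size=nominal.shape)
        candidates.append(nominal + noise)
    costs = [rollout_cost(x0, c) for c in candidates]
    best = int(np.argmin(costs))
    return candidates[best], costs[best]


def plan(x0=(0.0, 0.0), horizon=20, iterations=50, seed=0):
    """Repeatedly improve a control sequence from a fixed initial state."""
    rng = np.random.default_rng(seed)
    nominal = np.zeros(horizon)
    history = [rollout_cost(x0, nominal)]
    for _ in range(iterations):
        nominal, cost = sampling_step(x0, nominal, rng)
        history.append(cost)
    return nominal, history
```

Calling plan() returns the improved control sequence and the cost after each iteration. In a receding-horizon setting, one would apply the first control, advance the state, shift the plan by one step, and repeat; that outer loop is omitted here for brevity.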